{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "dd9412ba-0ef7-4018-880c-5dc29f0ff39d",
   "metadata": {},
   "source": [
    "# Basic Usage: DiffPrivateSimpleDatasetPack"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a3bb0e19-160d-45d3-888d-95bfbc0933ad",
   "metadata": {},
   "source": [
    "In this notebook, we demonstrate the basic usage of `DiffPrivateSimpleDatasetPack`. The purpose of this pack is to create privacy-safe, synthetic observations (or copies) of an original, likely sensitive dataset.\n",
    "\n",
    "### How it works\n",
    "`DiffPrivateSimpleDatasetPack` operates on the `LabelledSimpleDataset` type, a llama-dataset made up of `LabelledSimpleDataExample`s, each with a `text` and a `reference_label` field. Calling the pack's `.run()` method produces a new `LabelledSimpleDataset` containing privacy-safe, synthetic `LabelledSimpleDataExample`s.\n",
    "\n",
    "### This notebook:\n",
    "In this notebook, we create a privacy-safe, synthetic version of the AGNEWs dataset. The raw AGNEWs data was first converted into a `LabelledSimpleDataset` (see `_create_agnews_simple_dataset.ipynb`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a3a3a21f-b6d0-4a3d-9a2f-e9fd7d9e9e0a",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Note: you may need to restart the kernel to use updated packages.\n"
     ]
    }
   ],
   "source": [
    "%pip install treelib -q"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1e08d647-343f-4278-882b-1c080e719310",
   "metadata": {},
   "outputs": [],
   "source": [
    "import nest_asyncio\n",
    "\n",
    "nest_asyncio.apply()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6d17533f-fe63-4547-b3ce-cf8f76bec599",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.instrumentation.span_handlers import SimpleSpanHandler\n",
    "import llama_index.core.instrumentation as instrument\n",
    "\n",
    "span_handler = SimpleSpanHandler()\n",
    "dispatcher = instrument.get_dispatcher()\n",
    "dispatcher.add_span_handler(span_handler)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "013f873a-5157-4c39-a277-01ced6f7de9c",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.llama_dataset.simple import LabelledSimpleDataset\n",
    "from llama_index.packs.diff_private_simple_dataset.base import PromptBundle\n",
    "from llama_index.packs.diff_private_simple_dataset import DiffPrivateSimpleDatasetPack\n",
    "from llama_index.llms.openai import OpenAI\n",
    "import tiktoken"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "53a3bed2-d702-4765-8366-cdedfafc10e2",
   "metadata": {},
   "source": [
    "### Load LabelledSimpleDataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a4c06c01-5450-43e5-bd31-837ec02dea33",
   "metadata": {},
   "outputs": [],
   "source": [
    "simple_dataset = LabelledSimpleDataset.from_json(\"./agnews.json\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d4d98604-fc39-4ff5-8396-6d70a039bebc",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>reference_label</th>\n",
       "      <th>text</th>\n",
       "      <th>text_by</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Business</td>\n",
       "      <td>Wall St. Bears Claw Back Into the Black (Reute...</td>\n",
       "      <td>human</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Business</td>\n",
       "      <td>Carlyle Looks Toward Commercial Aerospace (Reu...</td>\n",
       "      <td>human</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Business</td>\n",
       "      <td>Oil and Economy Cloud Stocks' Outlook (Reuters...</td>\n",
       "      <td>human</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Business</td>\n",
       "      <td>Iraq Halts Oil Exports from Main Southern Pipe...</td>\n",
       "      <td>human</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Business</td>\n",
       "      <td>Oil prices soar to all-time record, posing new...</td>\n",
       "      <td>human</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  reference_label                                               text text_by\n",
       "0        Business  Wall St. Bears Claw Back Into the Black (Reute...   human\n",
       "1        Business  Carlyle Looks Toward Commercial Aerospace (Reu...   human\n",
       "2        Business  Oil and Economy Cloud Stocks' Outlook (Reuters...   human\n",
       "3        Business  Iraq Halts Oil Exports from Main Southern Pipe...   human\n",
       "4        Business  Oil prices soar to all-time record, posing new...   human"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "simple_dataset.to_pandas()[:5]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "afe67738-12d5-4a85-aa70-0510bc772038",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "reference_label\n",
       "Business    30000\n",
       "Sci/Tech    30000\n",
       "Sports      30000\n",
       "World       30000\n",
       "Name: count, dtype: int64"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "simple_dataset.to_pandas().value_counts(\"reference_label\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d2baaff8-6c3d-45bf-8e24-0c59d11bd70c",
   "metadata": {},
   "source": [
    "### Instantiate Pack"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9c0ec7d2-6455-410c-ae39-5ae74d87bc54",
   "metadata": {},
   "source": [
    "To construct a `DiffPrivateSimpleDatasetPack` object, we need to supply:\n",
    "\n",
    "1. an `LLM` (must return `CompletionResponse`),\n",
    "2. its associated `tokenizer`,\n",
    "3. a `PromptBundle` object that contains the parameters required for prompting the LLM to produce the synthetic observations,\n",
    "4. a `LabelledSimpleDataset`,\n",
    "5. [Optional] `sephamore_counter_size`, which helps reduce the chance of hitting a `RateLimitError` when calling the LLM's completions API,\n",
    "6. [Optional] `sleep_time_in_seconds`, which likewise helps reduce the chance of hitting a `RateLimitError` when calling the LLM's completions API."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6671b450-04c5-4aa4-bbb3-90d2c2acf4fa",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = OpenAI(\n",
    "    model=\"gpt-3.5-turbo-instruct\",\n",
    "    max_tokens=1,\n",
    "    logprobs=True,\n",
    "    top_logprobs=5,  # OpenAI only allows for top 5 next token as opposed to entire vocabulary\n",
    ")\n",
    "tokenizer = tiktoken.encoding_for_model(\"gpt-3.5-turbo-instruct\")\n",
    "\n",
    "prompt_bundle = PromptBundle(\n",
    "    instruction=(\n",
    "        \"Given a label of news type, generate the chosen type of news accordingly.\\n\"\n",
    "        \"Start your answer directly after 'Text: '. Begin your answer with [RESULT].\\n\"\n",
    "    ),\n",
    "    label_heading=\"News Type\",\n",
    "    text_heading=\"Text\",\n",
    ")\n",
    "\n",
    "dp_simple_dataset_pack = DiffPrivateSimpleDatasetPack(\n",
    "    llm=llm,\n",
    "    tokenizer=tokenizer,\n",
    "    prompt_bundle=prompt_bundle,\n",
    "    simple_dataset=simple_dataset,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "866c5846-cb57-4748-aaa4-da1473b90438",
   "metadata": {},
   "source": [
    "To generate a single synthetic example, we can call the `generate_dp_synthetic_example()` method. Synthetic examples are created for a specific `label`. Both sync and async calls are supported. A few params are required:\n",
    "\n",
    "- `label`: The class for which you want to generate a synthetic example.\n",
    "- `t_max`: The maximum number of tokens to generate (the algorithm applies its differential privacy logic to each generated token).\n",
    "- `sigma`: Controls the variance of the noise distribution used by the differential privacy noise mechanism. Each value of `sigma` corresponds to a level of `epsilon` satisfied in differential privacy.\n",
    "- `num_splits`: The number of disjoint splits of the original dataset that the differential privacy algorithm implemented here relies on.\n",
    "- `num_samples_per_split`: The number of private, in-context examples to include in the generation of the synthetic example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7f6f266c-5c7e-4d7f-b690-bbaf30ac6bb2",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|█████████████████████████████████████████████████████████████████████████████████████| 35/35 [00:33<00:00,  1.05it/s]\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "LabelledSimpleDataExample(reference_label='Sports', text='The 2019 World Series has been won by the Washington Nationals, who defeated the Houston Astros in Game 7 with a final score of 6-2', text_by=CreatedBy(model_name='gpt-3.5-turbo-instruct', type=<CreatedByType.AI: 'ai'>))"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dp_simple_dataset_pack.generate_dp_synthetic_example(\n",
    "    label=\"Sports\", t_max=35, sigma=0.1, num_splits=2, num_samples_per_split=8\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5ef627b7-5226-4aa7-a158-7f9836e26480",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|█████████████████████████████████████████████████████████████████████████████████████| 35/35 [00:33<00:00,  1.04it/s]\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "LabelledSimpleDataExample(reference_label='Sports', text='The 2018 Winter Olympics in Pyeongchang, South Korea, have come to a close, and the United States has once again dominated the medal count', text_by=CreatedBy(model_name='gpt-3.5-turbo-instruct', type=<CreatedByType.AI: 'ai'>))"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "await dp_simple_dataset_pack.agenerate_dp_synthetic_example(\n",
    "    label=\"Sports\", t_max=35, sigma=0.1, num_splits=2, num_samples_per_split=8\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f125840e-3269-41b7-87bc-1e5ea013068c",
   "metadata": {},
   "source": [
    "To create a privacy-safe, synthetic dataset, we call the `run()` (or async `arun()`) method. The required params for this method were introduced above, with the exception of `sizes`.\n",
    "\n",
    "- `sizes`: Either an `int` or a `Dict[str, int]` specifying the number of synthetic observations to generate per label."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cb0bd383-2ecd-4edb-bea2-bd5fc1a663c8",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|█████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:17<00:00,  1.77s/it]\n",
      "100%|█████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:18<00:00,  1.83s/it]\n",
      "100%|███████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:18<00:00,  9.14s/it]\n"
     ]
    }
    ],
    "source": [
     "synthetic_dataset = await dp_simple_dataset_pack.arun(\n",
     "    sizes={\"World\": 1, \"Sports\": 1, \"Sci/Tech\": 0, \"Business\": 0},\n",
     "    t_max=10,\n",
    "    sigma=0.5,\n",
    "    num_splits=2,\n",
    "    num_samples_per_split=8,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "346677df-038f-400c-bc21-69d45705f212",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>reference_label</th>\n",
       "      <th>text</th>\n",
       "      <th>text_by</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>World</td>\n",
       "      <td>As tensions continue to rise on</td>\n",
       "      <td>ai (gpt-3.5-turbo-instruct)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Sports</td>\n",
       "      <td>The latest sports news: The New</td>\n",
       "      <td>ai (gpt-3.5-turbo-instruct)</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  reference_label                             text  \\\n",
       "0           World  As tensions continue to rise on   \n",
       "1          Sports  The latest sports news: The New   \n",
       "\n",
       "                       text_by  \n",
       "0  ai (gpt-3.5-turbo-instruct)  \n",
       "1  ai (gpt-3.5-turbo-instruct)  "
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "synthetic_dataset.to_pandas()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "99f2a3f2-7cd0-4b75-95bf-67ee6f7603b5",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DiffPrivateSimpleDatasetPack.generate_dp_synthetic_example-92029346-818a-4466-9bf8-add83b0e43f4 (33.304743)\n",
      "└── DiffPrivateSimpleDatasetPack.agenerate_dp_synthetic_example-29f82dd5-d7ef-4cdb-9990-2096136ab139 (33.302189)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-25cc7261-9d65-4abd-b37b-70d877ee2194 (0.089156)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-d9a79a57-b4e4-4ed1-9884-beb006849927 (0.007764)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-5a2950f7-3cff-4103-972f-4458390b57ea (0.085112)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-93075908-1732-4594-bd41-c69fe7e320ee (0.007132)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-1fadb3e0-4339-424a-9806-8cb313968d74 (0.090871)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-830b2f0b-f254-4c56-aba0-467405a6c7a7 (0.00719)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-bb31ab4c-b377-405a-a1d3-03cfb2f95faa (0.075273)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-9234d4fe-29ff-4273-a0a7-add0a86f55e2 (0.007264)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-8547f70f-dc90-4b4d-a1e0-62996eee21e7 (0.067587)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-57798e79-ccb9-4593-b6e7-918e99550f0a (0.007257)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-e1f0e334-ffdb-4ff3-adbc-3ad454854091 (0.071019)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-530696b0-e6da-484d-9cec-8ba69042f618 (0.007118)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-1c107650-1f08-49a0-96ac-7c0083313b06 (0.343486)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-b22723b4-11c9-4cd8-b2c5-30f946b1db86 (0.007175)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-4b527e2e-c7f9-4ad3-b3ad-42c9e7d4fb97 (0.093578)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-6fd60e9e-bb50-4e59-9dc2-6d66ebedcbe9 (0.00675)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-9f258d61-6ed7-4e38-8118-e9b9d946a34b (0.098329)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-e736c62b-523c-4da9-8ff1-da3cf0ef5d6f (0.007293)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-c7ceab6b-09a8-4e29-9805-02f4b894469e (0.077786)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-84244eeb-7488-47c0-ab3d-23bcc804ec8f (0.007319)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-3f92f0ec-bc05-4951-881e-ccabf475a3bf (0.081149)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-a7e315c0-828e-4051-b4da-3ecc515a5cdb (0.007011)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-44607240-d46c-4f92-bd34-127eec8445e3 (0.075024)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-e91bcb75-9100-4639-9e36-4b1d67faf03a (0.007338)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-d7c5e05f-e536-48a7-928e-d6f733522121 (0.070831)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-1c028228-88c1-49cb-a0e0-827910b7f2d9 (0.007097)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-abc1a11b-bf16-4fb3-ac38-b0ad36b55f22 (0.073025)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-8ff0e917-56b0-408c-af53-ec1d4ad6eb50 (0.007085)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-757963c5-2b56-4b76-9637-d852f77e66b3 (0.0834)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-fc0e7a0d-e16c-4dcd-a733-e05c877d8bcf (0.007399)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-07a0c5b1-c8af-4c23-a968-9e2958daf0f9 (0.302865)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-6a8a812d-f684-444e-9ff8-7cb07c519ab1 (0.007258)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-1c43335d-aa76-4fd6-9c0f-46c34efaf1a4 (0.095691)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-880df617-6b1c-44bc-b96d-82c906a854b9 (0.007281)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-4e05a68b-a97a-49bd-9ecf-444c7b6d1033 (0.074367)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-4bbfbc2f-c3c7-48d8-b3bc-728c5cff50e0 (0.007374)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-ee590d0e-9982-4c6b-91c2-966acac7fb27 (0.075977)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-83f4b62f-65a6-4e21-a703-172517d78cd9 (0.008207)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-297ef668-fa3e-458d-9664-57086c428e24 (0.092614)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-51468128-fd3f-407a-a02e-f8d322741719 (0.008345)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-c843cf9e-7a0a-4217-81e1-dfcf1c06bba3 (0.076858)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-39b38278-d0a3-41fe-a9a9-f34931d47f03 (0.00737)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-75425ffc-7b4c-4807-b423-d6f9664ae7cc (0.08149)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-c4d59b9d-e44c-4d23-9cf4-9a2957764dfc (0.007738)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-83ff44ce-e249-4db9-ac36-b173428c8e6a (0.076796)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-5a30034f-50ac-4b77-83f5-5a345e769c86 (0.007352)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-4bfd87d7-bc18-4a63-adf5-9ecc9ef2d0ba (0.083827)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-4b8ad076-3825-48e7-8cb6-1d61f963be0c (0.007616)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-44a1ed42-d4c3-4c2e-bbc4-35b24cd7fdde (0.326301)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-f5717628-ef46-4098-a0a5-05ece49f5d3d (0.006896)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-4b047449-a5d2-478f-9e4b-11441bfe70e2 (0.079203)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-e796fdf2-72d3-4663-881d-533b1356ce3f (0.007302)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-5787a402-1c53-4638-8d81-5e9f6f9ac422 (0.076338)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-d9f0ba49-c993-44f7-819c-c61d3532afaa (0.00723)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-7cc95d28-6589-49b8-893d-96edcf65e5e4 (0.073529)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-993ffde5-dd32-4bce-9fdd-49d45bcc8803 (0.006992)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-613d8a2b-57c2-43e7-98d0-eedca0731607 (0.079526)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-c61f86de-a36f-4b74-a3ce-66d81b2d1b01 (0.007467)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-700f2789-b6ed-45dc-968c-cbf4caba30fd (0.073845)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-b6f79ea7-de86-43b3-b99f-5b571da0bcaa (0.007455)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-6a0893e1-fe75-4704-9aca-09f17e38c8f8 (0.072421)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-9457a82d-d229-437e-ab89-b798ba7e1efb (0.006903)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-8da08e3f-8bd7-4004-92a9-92c4d0a0462d (0.08159)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-920bb5ae-79b1-4b14-b91d-b0267b6ddf66 (0.007066)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-5349c9df-6989-4375-88b4-e1f9da8933c2 (0.088786)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-40b0f463-ed17-4f5c-8403-3b5991d8876a (0.006883)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-9ff4a67e-7c5c-49c8-a2ae-84d2973cb10d (0.059426)\n",
      "    ├── DiffPrivateSimpleDatasetPack._split_dataset-3bce4d27-0c39-4a99-baa7-4a2e67f8bbf3 (0.007046)\n",
      "    ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-457312e2-90ac-4e09-acd9-d42d14d7a8b2 (0.081913)\n",
      "    └── DiffPrivateSimpleDatasetPack._split_dataset-5b34c307-1c4b-406f-91d8-e2f3edc5becb (0.006948)\n",
      "\n",
      "\n",
      "DiffPrivateSimpleDatasetPack.agenerate_dp_synthetic_example-7d75e009-3852-4351-97a1-2a87713ba359 (33.600234)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-a4567a8e-c2e8-4f41-b253-e91e8bc59997 (0.08754)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-00ea9dbb-130e-4c12-b637-037e6b41aea5 (0.007304)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-e221bc6b-0f62-4280-aad9-79a489d98ea8 (0.082097)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-6566174e-77b0-42b7-ba85-8e6f2865e16f (0.007384)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-87db2daa-0014-470c-a3bc-beed3aa030da (0.093534)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-f9bb611c-9fe2-41dd-971a-3eb9cc170f03 (0.006897)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-4670c297-9480-45a5-b53a-8d38b264d432 (0.092346)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-ca1ee763-3caa-4df7-b3ee-5c98a7c17ecc (0.007339)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-d64ccc24-61e6-4880-a1f8-7866ba054654 (0.090525)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-fd6e3791-ef88-44e0-bf9b-aec27b956ebe (0.006901)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-9dc79e58-e1fc-4bc5-b8f5-ccc40e729d7f (0.079242)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-cc1f6a66-56f1-4473-aee1-761b54a133e7 (0.007058)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-86342ea3-6044-49fa-ad39-1cb3242d8879 (0.305845)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-c21eac65-c43d-42e5-ab29-9b442e966356 (0.007365)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-f9423189-070a-4c8c-91a5-e0b8e5856a13 (0.078745)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-1d2ee45a-6465-4a63-adbb-11e72c823dc9 (0.00772)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-755d12dc-9d56-4a78-897d-d4ffcf924359 (0.079349)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-f4cae7ff-1b6a-40c0-bd55-c2e134b33926 (0.007101)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-3f374fd8-c5fb-4723-9284-54f2168a6224 (0.07167)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-70b48089-1bcb-4f29-8e85-10d95156f888 (0.007246)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-2428344e-127d-4e0a-958d-3b8b197d4d75 (0.072421)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-55d668d9-90b1-4eb7-9142-97e83c589710 (0.007112)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-06b72816-0882-4bd0-b098-27ebfa86309c (0.079857)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-30c847f1-7671-4ebe-9a7c-4b41e704a542 (0.007074)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-6b058e75-b729-4aab-911e-94bd733f2f86 (0.096461)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-47ab6590-0ea2-4198-8449-67092485045e (0.006968)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-285ca04e-0ac3-4ae9-b4f5-b44af7a4995a (0.092171)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-da3cdc88-a797-4dfc-aa75-04a433f7df5b (0.007111)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-fa24a4bd-6f3a-456e-a9f0-2291dde8131e (0.072086)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-26d67478-b2e0-41dd-b524-238edb277453 (0.007162)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-58f9325a-a540-4d18-ad81-71b03c932abf (0.337936)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-8ce6cef1-1d7a-4c8b-a51f-0ddeaf90bb7e (0.007014)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-42b85ddf-52da-41e7-87b2-cd32717349f5 (0.07497)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-43a6af50-c9b1-4ae1-ad48-5eef1881edeb (0.007219)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-2470e20e-5367-4e89-9e32-5bf286d0968d (0.092514)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-2b44f83d-56ef-4f79-992a-a04051d2ff28 (0.0069)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-c6b88cee-d66a-4bbb-b2cc-d2084b3043ea (0.089704)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-59828054-29fc-4cf5-b857-afec80ab3a47 (0.007084)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-f7e9f690-a89c-4fc1-9f59-d3b5f85fc838 (0.078206)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-cf5b4ddc-9caf-4642-819c-00cfd4185844 (0.007137)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-d5272bdc-14cd-4654-97de-507779fcb429 (0.073427)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-f6493915-dd74-4962-813b-76298543b01c (0.007202)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-ef86ee9b-d174-4af5-abb4-059552498d23 (0.084554)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-3bfa5a98-703d-4326-b6af-ce20430ca683 (0.007365)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-340b25cc-7a5d-4bf9-9c7d-695b5a8daaf9 (0.081083)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-4fcb1a9d-1b0e-4078-8038-bb4ab028d322 (0.007033)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-5bdaa44d-8fa2-4456-a239-85a22c25c130 (0.089066)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-d0f5f71e-d66e-41b5-aa3a-292ae1ce9c0c (0.006923)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-e5b09fd9-436d-4b28-a0e4-4b74a2bd39bb (0.334729)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-fcabe1aa-86b8-4a9c-bfac-4834e4c6903c (0.008135)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-3e3b18fc-0720-40aa-9007-340b895fa4af (0.090975)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-97cefc97-6140-4758-aace-c4c0839f0227 (0.007178)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-66ea13e0-a943-47f4-bb42-81cdde18d82a (0.096352)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-e1f40787-2044-49d1-9306-7d8f727b655c (0.007244)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-596c989f-b39a-40e3-a88c-aa0c5219d315 (0.074101)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-ef629a0e-27ca-4b7e-ad0d-9ba972b5c7fb (0.007138)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-8a0947c5-ed73-4497-9b5f-016d5ee20c94 (0.084003)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-d9ba8042-73c3-43d7-a057-325b8ebdb077 (0.007059)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-6c58bcd5-c5c7-46b2-9246-7de01baa000a (0.092873)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-5c1e9032-0d2b-46eb-b62b-29d3d84b910e (0.007294)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-7cd7e5ac-b1be-428f-abf5-57aefaccbd94 (0.073786)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-82165b59-7e89-4bc6-b02b-5b3b46518cd2 (0.006995)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-82f0f27d-3f95-40a6-a75b-d2feb53ad221 (0.075927)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-bce47e18-0603-4bbf-a442-f9ab65166a25 (0.006877)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-1298bf15-a645-4ff2-8732-64a0c96314dd (0.085454)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-ad460bfd-05a5-4baf-bbf6-42540c263ec2 (0.007137)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-83baad09-b690-4471-ab09-15dff1e366d3 (0.281037)\n",
      "├── DiffPrivateSimpleDatasetPack._split_dataset-51d2caa6-233f-425e-acff-6f9ba5f8e3a1 (0.007041)\n",
      "├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-e0dc03c7-7f6e-467d-ae43-f56607cd8ee5 (0.095103)\n",
      "└── DiffPrivateSimpleDatasetPack._split_dataset-387d0d77-c38a-4ab5-a8d0-f19c1825f730 (0.007039)\n",
      "\n",
      "\n",
      "DiffPrivateSimpleDatasetPack.arun-cf61ecbe-6959-42ce-9451-9d1085b858b2 (18.280578)\n",
      "└── DiffPrivateSimpleDatasetPack.agenerate_dp_synthetic_example-907cfb9d-a039-49e9-8070-7d2b224fd6ae (17.660056)\n",
      "    └── DiffPrivateSimpleDatasetPack.agenerate_dp_synthetic_example-88b5a06b-8bd9-4897-a093-afbbae21ecf4 (18.264956)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-b0d584f3-6fd2-4db9-860d-2f3390ca7f3b (0.109113)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-bf6a08d6-9370-40a0-afb2-9330397c2a85 (0.007292)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-49a8d913-a9c1-4db6-8839-b9ce3a99f99c (0.095109)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-fe0546f6-24ba-46eb-baf3-be8ba38035e5 (0.007408)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-72a22a46-51a5-409b-a05d-1f6dd403baec (0.078942)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-9093b1ed-0648-4141-90dc-fc22d6368941 (0.007116)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-d388adfd-67cd-4298-b3b6-251ec5eb392a (0.090168)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-6513de9c-8522-4b03-a979-adee38cfcdc8 (0.006997)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-53ddb4cd-f657-4b11-ad4b-e2a4f2e80eeb (0.09568)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-1b2f3384-cb48-4a19-8cf6-935f7c3f4ff3 (0.007192)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-7226333a-6a30-458f-99d7-17411f775fec (0.097329)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-8a228f0c-9014-4194-b7e7-9e06e56b1d04 (0.00686)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-38d4d861-bd9e-4c5e-9aed-67a76ae842d0 (0.373755)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-476d1d70-cc7f-46c3-b253-ba197e9449bb (0.007306)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-4dcbc0f5-f072-4dc9-ac39-aa15f8666a54 (0.094175)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-833260d4-6aa3-497b-b73d-ee842645b728 (0.006641)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-d224b8a6-380b-4846-bdf1-b8bbe2fe4258 (0.093407)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-d99e6aa5-c35f-4483-b8e4-1d6dd3fb60c8 (0.006974)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-fd27b68d-3e8d-4200-979e-f0737bac310b (0.074941)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-c5f933e1-43ff-448f-a8e8-f5cb61416274 (0.007085)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-d062bb6b-796b-43a4-a3e3-0ae33d27a391 (0.077881)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-6fe03b7f-d291-48c0-86e4-12dddba3eca7 (0.007596)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-1234ee77-6522-44fa-b517-9006b01a8723 (0.075711)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-4bd5cdfe-c1c8-42fb-bfd4-91294943c81b (0.007183)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-4bb2f156-703b-49b0-b1a5-3a0a48df21d2 (0.080322)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-64e5c408-f647-4948-92a7-c23336a3d193 (0.007894)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-95231893-0125-4b5a-85f6-4714c2168d84 (0.077434)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-e5b5b9bd-dd1a-481f-b5d1-648730ef4be4 (0.007644)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-9a5c177c-1029-43ea-8cfe-e2cd42c8e6a5 (0.081067)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-1c3c9ee4-e754-44a5-b1bc-a451107fc7ac (0.006925)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-f23c7159-aea6-49d0-b46e-7444bd2c20bb (0.309341)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-7f35be47-ddcf-4a05-8d2a-fe82585231a3 (0.007125)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-d84c7f69-f393-474f-82b0-9af3a3e603cd (0.096701)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-ac37f7c2-4abb-4ecf-8bd5-fcab57de970d (0.006929)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-bb63d809-5aac-4dd0-8f34-36021b3205eb (0.088839)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-c8e85519-03f5-4a3f-aba6-65a4d70424e8 (0.006788)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-4169301f-f43a-48b9-91de-5e74d50465a7 (0.088204)\n",
      "        ├── DiffPrivateSimpleDatasetPack._split_dataset-f98944e8-ec31-45ba-a027-ec4642208748 (0.011754)\n",
      "        ├── DiffPrivateSimpleDatasetPack._filter_dataset_by_label-b988b2bb-1ada-47cb-b285-999abaf0c571 (0.063606)\n",
      "        └── DiffPrivateSimpleDatasetPack._split_dataset-ddcc647b-4839-4d1d-963e-ca9df14d9854 (0.006956)\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "span_handler.print_trace_trees()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "diff-privacy",
   "language": "python",
   "name": "diff-privacy"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
